The international response to SARS-CoV has produced an outstanding number of protein structures in
a very short time. This review summarizes the findings of functional and structural studies including
those derived from cryoelectron microscopy, small angle X-ray scattering, NMR spectroscopy, and X-ray
crystallography, and incorporates bioinformatics predictions where no structural data are available.
pathogenesis are highlighted. The high percentage of novel protein folds identified among SARS-CoV


© 2013 Published by Elsevier B.V. 


1. Introduction 


In the wake of the SARS crisis, a wave of structural proteomics 
swept the coronavirus research community. The focus of this effort 
was to understand the interplay between structure and function in 
what had been, until that time, a somewhat neglected branch of 
the positive-stranded viruses. The unusual aspect of the SARS pro- 
teomics at the time was its evenhandedness - rather than focusing 
exclusively on proteins with well-defined roles in pathogenesis, 
competing international teams attempted to solve structures and 
assign functions across the entire viral proteome. 

This effort brought fresh attention to several little-known 
replicase cofactors, such as the European group’s structure of 
the obscure but important RNA binding protein nsp9 (Egloff 
et al., 2004), the Chinese group’s barrel-shaped 16-protein struc-
ture of the nsp7+8 primase complex (Zhai et al., 2005) and the
American group’s long crawl through the giant multi-domain, 
multi-enzymatic protein nsp3 which found the first of three SARS- 
CoV macrodomain folds (Saikatendu et al., 2005). 

Shortly after the outbreak, the sequence of the genome was com-
pleted and the 3-D structure of Mpro, the main protease essential
for viral replication, was deposited in the Protein Data Bank (PDB).
By 2007, 100 entries in the PDB covered 14 of the 28 SARS-CoV
proteins, and at present count there are 99 structures of coronavi-
rus Mpro available in the PDB alone, providing an unprecedented
database for investigators working on this and related viruses. This 
review summarizes the findings of functional and structural studies 
including those derived from cryoelectron microscopy, small angle 
X-ray scattering, NMR spectroscopy, and X-ray crystallography in 
an attempt to understand the function and biological roles of the 
proteins in viral replication and pathogenesis. 


2. A note on functional organization


The new wealth of structural and functional information 
revealed that the coronavirus replicase, which is but one biolog- 
ically successful example of the conserved nidovirus replicative 
machinery (Lauber et al., 2013), is not a patchwork amalgam of 
evolutionary jetsam, but an organized piece of biological machin- 
ery where proteins are generally organized into units with related 
functions (Fig. 1). The first two parts of the replicase, nsp1 and
nsp2, are somewhat enigmatic, but appear to work by interfering
with host defenses rather than by directly supporting virus replica- 
tion. Subunits nsp3-6 contain all the viral factors that are necessary 
to form viral replicative organelles (Angelini et al., 2013), as well 
as two proteinases that are responsible for processing all of the 
viral replicase proteins (Ziebuhr et al., 2000). The small subunits 
nsp7-11 comprise the viral primer-making activities and provide 
other essential support for replication (Donaldson et al., 2007b; 
Imbert et al., 2006; Miknis et al., 2009). The final part of the replicase 
from nsp12-16 contains the remaining RNA-modifying enzymes 
needed for replication, RNA capping and proofreading. 
Fig. 1. Conservation of the SARS-CoV replicase. Replicase subunits, or domains for nsp3, were color-coded according to percent identity between homologous proteins of
SARS-CoV and MERS-CoV. Alignments and identity calculations were performed using Clustal Omega (Sievers et al., 2011). 


The organization of the replicase has a sort of chronological logic
to it. Nsp1-2 help to colonize the host, followed by nsp3-6, which
lay a foundation to organize and protect the replicative machin- 
ery. This is followed by the primer-making activities of nsp7-11 
which also interact with downstream capping and RNA synthesis 
factors (Bouvet et al., 2010). Finally, in the proper framework, the 
RNA-synthesizing enzymes from the C-terminus of the replicase 
are able to function. While this may be an appealing way to think 
of the replicase, the reality is probably much more complex. The 
replicase proteins are all processed from large polyproteins, and 
therefore are produced at the same time. Because of this, the order 
in which different proteins are active during the viral replication 
cycle remains poorly understood. 

The organization of the replicase also roughly follows a gradient 
of primary sequence conservation. Levels of sequence conserva- 
tion among the different coronaviruses are highest at the 3’ end of 
the replicase gene, and the sequences are very divergent at the 5’ 
end, especially in nsp1-3, which are products of nsp3 PLpro cleav-
age. The double-membrane vesicle (DMV)-making proteins and the primase group of proteins
show intermediate levels of conservation with the exception of the
well-conserved nsp5 Mpro. Fig. 1 illustrates amino acid conserva-
tion across the replicase using the comparison between SARS-CoV
and MERS-CoV as an example. 
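
The percent identity values used to color Fig. 1 come from pairwise alignments. As a minimal sketch of how such a value can be derived from two already-aligned sequences, the Python function below counts identical residues over the columns in which both sequences carry a residue. The choice of denominator (gap-free columns only) is an assumption made here for illustration and may differ from the convention used by Clustal Omega; the two toy sequences are invented and are not coronavirus sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: gaps are written as '-' and both strings come from the same
    pairwise alignment, so they have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy, made-up aligned fragments (hypothetical, for illustration only).
toy_a = "MKT-LLVA"
toy_b = "MKSALL-A"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

With a different denominator (for example, the full alignment length or the length of the shorter sequence) the same alignment yields a lower value, so the convention should be stated whenever identity percentages are compared between studies.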


3. The Atlas 
3.1. Nsp1 
See Box 1. 


3.1.1. Structure

Nsp1 is the N-terminal cleavage product of the replicase 
polyprotein and is produced by the action of PLpro. Nsp1 is not found
in the gammacoronavirus or deltacoronavirus lineages, which 
encode a distant homolog of SARS-CoV nsp2 at the N-terminus of 
the replicase polyprotein. This has led to a suggestion that nsp1 is
useful as a group-specific marker (Snijder et al., 2003). SARS-CoV 
nsp1 is 179 residues long. 

In the alphacoronaviruses, nsp1 (also known as p9) is a pro- 
tein of about 110 residues, with 20-50% sequence identity among 
all alphacoronaviruses. The betacoronaviruses of subgroup A, such 
as murine hepatitis virus (MHV) and human coronavirus OC43, 
encode an nsp1 protein of about 245 residues, also known as p28 
(Brockway and Denison, 2005). The nsp1 of SARS-CoV and its bat 


Box 1: Key nsp1 and nsp2 structures 


Virus      Protein   Method            Accession   Reference
SARS-CoV   nsp1      NMR               2HSX        Almeida et al. (2007)
TGEV       nsp1      X-ray (1.49 Å)    3ZBD        Jansson (2013)
IBV        nsp2      X-ray (2.5 Å)     3LD1        Yu et al. (2012)


Please cite this article in press as: Neuman, B.W., 


http://dx.doi.org/10.1016/j.virusres.2013.12.004 


et al. Atlas 


equivalents, which have been classified as the only members to 
date of the betacoronavirus subgroup B (Gorbalenya et al., 2006; 
Gorbalenya et al., 2004; Snijder et al., 2003), have 179 residues. 
Nsp1 sequences are divergent between subgroups of betacoron- 
avirus, and no sequence similarity between SARS-CoV nsp1 and 
betacoronavirus subgroup A nsp1 proteins could be identified using 
standard searching tools such as PSI-BLAST. 

Almeida et al. (2006, 2007) determined the NMR structure of 
the nsp1 segment from residue 13 to 128 and also showed that the 
polypeptide segments of residues 1-12 and 129-179 are flexibly 
disordered (PDB ID 2GDT; 2HSX) (Almeida et al., 2007). Residues 
13-128 of nsp1 represent a novel α/β-fold formed by a mixed
parallel/antiparallel 6-stranded β-barrel, an α-helix covering one
opening of the barrel, and a 3₁₀-helix alongside the barrel (Fig. 2).
NMR data indicate that full-length nsp1 has the same globular 
fold as the truncated nsp1, but with additional flexibly disordered 
regions that correspond to the N-terminal region (residues 1-12) 
and the long C-terminal tail (residues 129-179). 

The C-terminal portion of SARS-CoV nsp1 is flexibly disordered. 
Interestingly, it has been determined that the C-terminal half of 
MHV nsp1 (Lys124-Leu241) is dispensable for viral replication in 
culture but is important for efficient proteolytic cleavage at the 
nsp1-2 peptide linkage by the papain-like protease and optimal
viral replication (Brockway and Denison, 2005). Likewise, the long
disordered terminus of SARS-CoV nsp1 is probably important for
the efficient proteolytic processing of this protein from the nascent 
viral polyprotein chain. 

The structure of nsp1 of transmissible gastroenteritis virus (TGEV) was
recently solved, and was found to contain a similar fold to SARS-CoV
nsp1 (Jansson, 2013). This was surprising as there is no detectable
homology between alphacoronavirus nsp1 proteins and betacoro-
navirus nsp1 proteins. However, the relationship of the structures
suggests that coronavirus nsp1 proteins share a common evolu- 
tionary origin. 


3.1.2. Function 
In several coronaviruses, nsp1 suppresses host gene expression 
(Huang et al., 2011; Kamitani et al., 2006; Narayanan et al., 2008; 


Fig. 2. Comparison of nsp1 structure in alpha- and betacoronavirus lineages. The 
SARS-CoV nsp1 structure comes from PDB entry 2HSX, and the TGEV nsp1 structure 
comes from PDB entry 3ZBD. 
Fig. 3. Bioinformatics detection of homology and domain organization in nsp2. The structure of IBV nsp2 is taken from PDB entry 3LD1. Predicted secondary structures for
representative coronavirus lineages were made by PsiPRED 3.0 (McGuffin et al., 2000). Rectangles represent α-helices and arrows represent β-strands.


Zust et al., 2007a). The mechanism of nsp1-mediated host suppres-
sion remains a topic of active research. SARS-CoV nsp1 binds to 40S
ribosomal subunits and suppresses host gene expression (Kamitani
et al., 2009). It has also been shown that SARS-CoV nsp1 promotes 
host mRNA degradation but coronavirus RNA species are protected 
from degradation (Kamitani et al., 2009; Tanaka et al., 2012). 

In MHV, nsp1 arrests the cell cycle of transfected cells in G0/G1
phase (Chen et al., 2004). MHV mutants that are incapable of liberat- 
ing nsp1 from the nascent polyprotein exhibit delayed replication, 
diminished peak titers, and reduced RNA synthesis compared to 
wild-type controls (Denison et al., 2004). A point mutation in the 
proteolytic cleavage site between nsp1 and nsp2 in full-length 
TGEV genome blocks the release of nsp1 from the nascent polypro- 
tein and causes a dramatic reduction in virus recovery (Galan et al., 
2005). 

Kamitani et al. have shown that plasmid-driven expression of
nsp1 (driven by SV40, CMV and IFN-β promoters) sharply reduces
protein expression (Kamitani et al., 2006). This correlated with a
reduction in the specific mRNA, whereas rRNA remained unaf-
fected. More generally, transfected nsp1 mRNA that was capped
and polyadenylated decreased host protein synthesis, and the
inclusion of actinomycin D (to block new transcription) showed
a much stronger inhibition of protein synthesis in the presence of
nsp1, demonstrating that while translation of new transcripts
was proceeding (in cells not treated with actinomycin D), transla-
tion from pre-existing transcripts was blocked by nsp1. Decreased
mRNA levels and decreased translation of pre-existing mRNA, pre-
sumably as a result of degradation, were also seen during infection
with SARS-CoV.

SARS-CoV nsp1 has also been shown to be a potent inducer of 
CCL5, CXCL10 and CCL3 expression in human lung epithelial cells 
via the activation of NF-κB (Law et al., 2007). The pathogenesis of
SARS-CoV infection is characterized by a hyperimmune response 
and the massive elevation of chemokine levels. In contrast, HCoV- 
229E, HCoV-OC43, and MHV did not significantly induce chemokine 
expression, perhaps because these only cause mild upper respira- 
tory tract diseases. 

The kinetics of nsp1 expression suggests that it might have 
an early regulatory role during the viral life cycle. Nsp1 is the 
first mature protein processed from the gene 1 polyprotein and 
is likely cleaved quickly following translation of PL1pro within nsp3
(Baker et al., 1989; Baker et al., 1993; Denison and Perlman, 1987; 
Denison et al., 1992, 1995; Denison and Perlman, 1986). MHV 
mutants that are incapable of liberating nsp1 from the nascent 
polyprotein exhibit delayed replication, diminished peak titers, 
small plaques, and reduced RNA synthesis compared to wild-type 
controls (Denison et al., 2004). These results emphasize the impor- 
tance of nsp1 cleavage for optimal viral RNA synthesis and suggest 
that nsp1 might play an important role at MHV replication com- 
plexes. However, later in infection, nsp1 is distinct from replication 
complexes and instead co-localizes with MHV structural proteins 
at virion assembly sites (Brockway et al., 2004). 

In MHV, nsp1 interacts with p10 and p15 (counterparts of SARS
nsp7 and nsp10, respectively; Brockway et al., 2004). Previous
immunolocalization and interaction studies in MHV have also indi- 
cated that in vivo, nsp1 may act in concert with numerous other 
viral proteins, namely the counterparts of SARS nsp2, 5, 8, 9, 12, 13 and
sars9a (Bost et al., 2000; Brockway et al., 2004). However, at some 
stages in the MHV life cycle, nsp1 has spatially different membrane 
localization from p65, the SARS nsp2 counterpart. It appears that 
later in infection, MHV nsp1 co-localizes with structural proteins 
at virion assembly sites rather than with the replication complexes 
(Brockway et al., 2004). Yeast two-hybrid (Y2H) and co-immunoprecipitation studies
indicate that nsp1 interacts with E and sars3a (von Brunn et al.,
2007). 


3.2. Nsp2 


3.2.1. Structure 

SARS-CoV nsp2 is a counterpart of the p65 protein (Denison 
et al., 1995) of MHV. As with nsp1, sequence homology is very low 
and does not permit confident sequence alignment. However, as 
with nsp1, bioinformatics gives an indication that coronavirus nsp2 
proteins share a common fold and origin (Fig. 3). The structure of 
the N-terminal 359 amino acids of the IBV equivalent of nsp2 has 
been released, but is currently awaiting full publication, though a 
crystallization report is available (Yang et al., 2009; Yu et al., 2012). 
Crystallization of part of the SARS-CoV nsp2 (Li et al., 2011) has also 
been reported, but the structure is not currently available pending 
full publication. The solved region of IBV nsp2 comprises about half 
the protein. It represents a novel multi-domain fold, though further 
structural and functional details await full publication. 

Interestingly, secondary structure prediction suggests that coro-
navirus nsp2 proteins consist of a duplicated fold, with a second,
more conserved fold similar to the structure 3LD1 immediately fol-
lowing the region solved in 3LD1 (Fig. 3). This is seen most clearly
in the gammacoronaviruses and deltacoronaviruses. This would fit
the context of domain and fold duplication at the N-terminal part of
the replicase polyprotein, which has been observed across the Coro-
naviridae and includes duplicated ubiquitin-like, papain-like,
and macrodomain folds (Neuman et al., 2008). 


3.2.2. Function 

The functions of nsp2 remain unknown. In MHV, p65 has spa-
tially different membrane localization from nsp1 and co-localizes
with the MHV homologue of SARS nsp8 (Sims et al., 2000). In MHV, 
p65 plays an important role in the viral life cycle (Hughes et al., 
1993) that appears to be distinct from that of its counterparts in 
other coronaviruses (Bost et al., 2001; Denison et al., 2004; Sims 
et al., 2000). Based on immunolocalization studies in MHV, p65 
may function in concert with counterparts of SARS nsp1, 5, 7, 8, 9, 
10, 12, 13 and sars9a (Bost et al., 2000; Brockway et al., 2004). Dele- 
tion mutagenesis with infectious clones of SARS and MHV indicated 
that nsp2 is dispensable for viral replication in cell culture; how- 
ever, deletion of the nsp2 coding sequence attenuates viral growth 
and RNA synthesis. 

The exact nature of the role of nsp2 in viral growth and RNA syn- 
thesis is still not clear. However, IBV nsp2 has a weak PKR antagonist 
activity, which may hint at a role complementary to that of nsp1 in 
interfering with intracellular immunity. A proteomics study with 
full-length SARS-CoV nsp2 also found that nsp2 bound prohibitin 
1 and prohibitin 2, which could contribute to the hypothetical role 
Fig. 4. Domain architecture and structure of the conserved DMV-making proteins nsp3 through nsp6. A partially schematic view combining the known domain structures
of SARS-CoV nsp3 is shown in panel A, combining the PDB entries 2IDY, 2ACF, 2FE8, 2KAF, 2JZF, 2W2G and 2K87. Conservation of structural domains and other important
sequence features is shown for four representative coronaviruses in panel B. Structures of the C-terminal domain of FCoV nsp4 (3GZF), SARS-CoV nsp5 (1UJ1), TGEV nsp5
(1LVO) and IBV nsp5 (2Q6F) are shown in panels C-F, respectively.


of nsp2 in counteracting intracellular immunity (Cornillez-Ty et al., 
2009).


3.3. Nsp3 


3.3.1. Structure 

SARS nsp3 is a large multidomain protein with 1922 residues 
(Snijder et al., 2003; Thiel et al., 2003). Nsp3, with nsp4, nsp5 and 
nsp6 forms a conserved block of proteins that are involved in form- 
ing the double-membrane vesicles that are the site of viral RNA 
synthesis (Fig. 4). Every completely sequenced coronavirus has an 
nsp3-related protein. All nsp3s are ~200 kDa, cleaved from the
polyprotein 1a or 1ab by PLpro.

We have compiled a higher-resolution analysis of nsp3 domain 
architecture as a tool for novel structural and functional charac- 
terization (Neuman et al., 2008). Based on phylogenetic analysis 
of coronavirus and torovirus nsp3 homologues, results from pre- 
viously published studies (Gorbalenya et al., 2006; Ratia et al., 
2006; Saikatendu et al., 2005; Serrano et al., 2007; Thiel et al., 
2003; Ziebuhr et al., 2001) and de novo domain prediction soft-
ware (Jaroszewski et al., 2005), we estimate that SARS-CoV nsp3 has
about 14 domains: UB1, AC (it is missing the PL1pro found in several
other CoVs), ADRP, SUD-N, SUD-M, SUD-C, UB2, PL2pro, NAB, G2M,
TM1, ZF, TM2, and Y, which may contain three structural domains.
A partially schematic model of the nsp3 structure is shown (Fig. 4).
Inferring from the presence of PL2pro cleavage sites at both termini
of nsp3, the observed glycosylation at positions 1431 and 1434 in
the ZF domain of SARS-CoV (Harcourt et al., 2004) and the homol-
ogous region of MHV (Kanjanahaluethai et al., 2007), SARS-CoV
nsp3 contains two transmembrane spans, placing the first 1395
residues (including the PL2pro domain), and the last 377 residues
(the Y domain) on the same face of the membrane (Oostra et al.,
2008). The two TM helices probably consist of residues 1396-1418
and 1523-1545. This transmembrane topology is similar to that 
proposed for MHV nsp3 (Kanjanahaluethai et al., 2007). Between 
helices two and three, there is a central, absolutely conserved tetrad 
of cysteines (CX44_1gC4-5C2C), which may represent a Zn finger and
is likely on the same side of the membrane as the domains
N- and C-terminal to the TM region. 
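
The residue counts quoted above can be checked with simple arithmetic. The sketch below, which assumes only the figures given in the text (a 1922-residue nsp3 and probable TM helices at residues 1396-1418 and 1523-1545), splits the protein into its TM and non-TM segments. The function name and the segment labels are hypothetical and chosen for illustration.

```python
NSP3_LENGTH = 1922                           # SARS-CoV nsp3 length from the text
TM_HELICES = [(1396, 1418), (1523, 1545)]    # probable TM helices from the text


def segments(length, helices):
    """Split residues 1..length into alternating non-TM and TM segments.

    Assumption: helices are given as inclusive (start, end) pairs in
    ascending order and do not overlap.
    """
    result = []
    start = 1
    for tm_start, tm_end in helices:
        if tm_start > start:
            result.append(("non-TM", start, tm_start - 1))
        result.append(("TM", tm_start, tm_end))
        start = tm_end + 1
    if start <= length:
        result.append(("non-TM", start, length))
    return result


for kind, first, last in segments(NSP3_LENGTH, TM_HELICES):
    print(f"{kind:6s} {first:4d}-{last:4d}  ({last - first + 1} residues)")
```

The first and last segments come out at 1395 and 377 residues, matching the figures in the text, and the 104-residue stretch between the two helices (1419-1522) contains the glycosylated positions 1431 and 1434.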


3.3.2. Function 

Although the function of the N-terminal region of polyprotein 
1a/polyprotein 1ab is not known, both the transcription-negative 
phenotype of an alphavirus X domain mutant (von Brunn et al.,
2007) and the conservation of a transcription factor-like zinc fin-
ger in coronavirus PLpro domains (Culver et al., 1993) indicated that
nsp3 might be involved in coronavirus RNA synthesis. This hypoth- 
esis is strongly supported by a report in which the equine arteritis 
virus nonstructural protein 1, which, most probably, is a distant 
homolog of the coronavirus PLpro, is shown to be a transcriptional
factor that is indispensable for sg mRNA synthesis (Phizicky and 
Greer, 1993). 


3.4. UB1 and AC 


3.4.1. Structure 

The sequence of the N-terminal domain of nsp3 (1-183) is highly 
conserved in different SARS coronavirus isolates but shows less 
than 25% sequence identity with other known proteins. This
region comprises two well-defined subregions with different physico-
chemical and structural properties. NMR was used to determine the
structure of the N-terminal domain (residues 1-110); this exhibits 
a ubiquitin-like fold with two additional helices which make the 
overall structure of this domain (UB1 domain) more elongated than 
other ubiquitin-like proteins (Serrano et al., 2007). NMR studies 
revealed that the highly acidic (51% Glu/Asp residues) C-terminal 
domain (residues 111-183; AC domain) is structurally disordered
(Serrano et al., 2007). 
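
A figure such as "51% Glu/Asp residues" is a simple composition statistic. As a minimal sketch, the function below computes the fraction of acidic residues (D and E in one-letter code) in a protein segment. The example string is a made-up placeholder, not the actual AC domain sequence, which would need to be taken from the nsp3 sequence record.

```python
def acidic_fraction(sequence: str) -> float:
    """Fraction of residues that are Asp (D) or Glu (E) in a one-letter sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    acidic = sum(1 for residue in seq if residue in "DE")
    return acidic / len(seq)


# Hypothetical toy segment for illustration only (not the AC domain).
toy_segment = "EDEAGDEEKD"
print(f"{100 * acidic_fraction(toy_segment):.0f}% Glu/Asp")
```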


3.4.2. Function 

UB1 has high structural homology to Ras-interacting proteins 
such as the Ras-interacting domain (RID) of RALGDS, a member 
of the RA family, and conserves residues important for the
interaction with Ras. Ras family proteins (RFPs) act as molecular
switches that cycle between inactive GDP- and active GTP-bound
states. RFPs control cell growth, motility, intracellular transport
and differentiation. Ras plays a fundamental role in cell progres-
sion from phase G0 to G1 (Dobrowolski et al., 1994; Peeper et al.,
1997). Molecular interactions that result in Ras inactivation prevent
Box 2: Key nsp3 and nsp4 structures

Virus       Domain       Method           Accession   Reference
SARS-CoV    UB1 and AC   NMR              2IDY        Serrano et al. (2007)
HCoV-229E   ADRP         X-ray (2.0 Å)    3EWR        Xu et al. (2009)
FCoV        ADRP         X-ray (3.9 Å)    3JZT        Wojdyla et al. (2009)
IBV         ADRP         X-ray (2.0 Å)    3EWP        Xu et al. (2009)
HCoV-NL63   ADRP         X-ray (1.9 Å)    2VRI        Awaiting publication
SARS-CoV    ADRP         X-ray (1.4 Å)    2ACF        Saikatendu et al. (2005)
TGEV        PL1pro       X-ray (2.5 Å)    3MP2        Wojdyla et al. (2010)
SARS-CoV    UB2-PL2pro   X-ray (1.9 Å)    2FE8        Ratia et al. (2006)
SARS-CoV    SUD-N-M      X-ray (2.2 Å)    2W2G        Tan et al. (2009)
SARS-CoV    SUD-M        NMR              2JZF        Chatterjee et al. (2009)
SARS-CoV    SUD-C        NMR              2KAF        Johnson et al. (2010)
SARS-CoV    NAB          NMR              2K87        Serrano et al. (2009)
FCoV        nsp4-CTD     X-ray (2.8 Å)    3GZF        Manolaridis et al. (2009)


cell progression to G1 phase. SARS-CoV and other coronaviruses 
such as MHV are able to induce cell arrest in G0/G1 phase during
the lytic infection cycles for their own replication advantage (Chen
and Makino, 2004; Yuan et al., 2005). Sars3b plays a role in this pro-
cess (Yuan et al., 2005) and nsp3 may also be involved in arresting
the cell cycle in the G0 phase.

Additionally, UB1 has structural homology with ISG15, an 
interferon-induced protein constitutively present in higher eukary- 
otes. This protein conjugates with cellular targets as a primary 
response to interferon-α/β induction and other markers of viral or
parasitic infection. High levels of this protein are essential for cellu- 
lar antiviral response. It is known that ISG15 is able to inhibit virus 
replication by abrogating nuclear processing of unspliced viral RNA 
precursors. However, some viruses have developed a mechanism to 
avoid the expression of ISG15. For example, influenza B virus blocks 
its expression by means of NS1 protein in order to overcome the 
immune response. It is possible that the PL2pro domain of nsp3 may
bind ISG15 and subvert the antiviral response of the cell. 

The functional significance of RNA-binding by this domain is 
unknown. It possesses a ubiquitin fold, as does the domain N-
terminal to the PLpro domain. We also do not know if this fact
has any functional significance. Also, the predicted function of this 
domain based on its similarity to Ras-binding proteins and ISG15 
remains to be experimentally validated. 

NMR experiments indicated a ligand bound to UB1, which was 
identified as a small RNA fragment by mass spectrometry. NMR 
studies have identified the interacting molecular interfaces. UB1 
of MHV has recently been shown to interact with the nucleopro- 
tein, effectively tethering nsp3 to viral RNA during the replication 
process (Hurst et al., 2013). This activity does not require the AC 
domain that follows UB1 and is hypervariable. 


3.5. ADRP/Macro1 


3.5.1. Structure 

The crystal structure of a construct consisting of residues 
184-365 has been determined for SARS-CoV (Saikatendu et al., 
2005), and the corresponding region has since been solved for sev- 
eral other coronaviruses (see Box 2). This region of nsp3 adopts 
a macro H2A domain fold. The putative active site and substrate- 
binding residues were conserved in its three structural homologues 
yeast Ymx7, Archaeoglobus fulgidus AF1521 and Er58 from E. coli, 
and its sequence homologue, yeast YBR022W, a known phos-
phatase that acts on ADP ribose-1”-phosphate (Appr-1”-p or ADRP).
The notable exception is that proposed active site residue Asp90 in
Ymx7 is an alanine in both the SARS-CoV ADRP (Ala50) and AF1521
(Ala44). Histidine residues in both enzymes proximal to the ter- 
minal 1” phosphate of the substrate (His45 in ADRP and His39 
in AF1521) might therefore be involved in catalysis (Saikatendu 
et al., 2005). Alternatively, the predominant nucleophile in the cat- 
alytic site may actually be an Asp or Glu in the conformationally 
flexible loop 19; NAGEDIQ 97 in SARS-CoV and the corresponding 
region in other coronaviral ADRPs (Saikatendu et al., 2005). The 
former proposition was verified by site-directed mutagenesis data
in the HCoV-229E ADRP, which showed that residues Asn37, Asn40, 
His45, Gly44 and Gly48 are part of the active site in the SARS-CoV 
ADRP (Putics et al., 2005). 


3.5.2. Function

The SARS ADRP readily hydrolyzes the 1” phosphate group 
from Appr-1”-p in vitro demonstrating that it is an active enzyme 
(Saikatendu et al., 2005). Another group validated this finding: both 
the SARS ADRP and the human coronavirus HCoV-229E counter-
part were shown to dephosphorylate Appr-1”-p to ADP-ribose in a
highly specific manner, the enzyme having no detectable activity 
on several other nucleoside phosphates (Putics et al., 2005). 

The role of an ADRP in the coronavirus life cycle may closely 
parallel that in the eukaryotic tRNA splicing pathway (Culver et al., 
1993; Phizicky and Greer, 1993; Saikatendu et al., 2005; Snijder 
et al., 2003). In coronaviruses, an early post infection event is 
the transcription of a nested set of sub-genomic mRNAs. Each 
sub-genomic mRNA contains a short 5’-terminal ‘leader’ sequence 
derived from the 5’ end of the genome (Lai and Holmes, 2001a,b; 
Thiel et al., 2003). The fusion of the two noncontiguous RNA seg- 
ments is a poorly understood process. It is thought to be achieved 
by a discontinuous step in the synthesis of the minus-strand and 
involves transcription regulatory sequences (Pasternak et al., 2001; 
Thiel et al., 2003). In eukaryotes, pre-tRNA splicing is initiated 
by cleavage at the splice site by an endonuclease. The result- 
ing tRNA fragments are then ligated to yield mature tRNA that 
retains the 2’ phosphomonoester group at the splice site (Phizicky 
and Greer, 1993). Using NAD as an acceptor, a phosphotrans- 
ferase removes the 2’ phosphate to yield ADP-ribose-1”-2” cyclic 
phosphate (Culver et al., 1994). A cyclic phosphodiesterase then 
hydrolyzes Appr>p to yield Appr-1”-p (Culver et al., 1994; Martzen
et al., 1999). Finally, a phosphatase converts Appr-1”-p into ADP-
ribose, releasing inorganic phosphate. While the equivalent for
the cyclic phosphodiesterase appears absent in the SARS proteome, 
the Appr-1”-p phosphatase (SARS ADRP) and an endonuclease 
(nsp15) are present. Characterization of an Appr-1”-phosphatase- 
deficient HCoV-229E mutant revealed no significant effects on viral 
RNA synthesis and virus titer (Putics et al., 2005). 

Egloff et al. (2006) suggested that ADRP may primarily be a poly- 
ADP-ribose binding (PAR-binding) module. PARylation occurs in 
compromised cells to trigger apoptosis. PAR polymerases (PARPs) 
are responsible for so tagging proteins. PARP is activated on rec- 
ognizing nicked DNA, and it helps in DNA repair. It auto-PARylates 
itself, and in case of extreme DNA damage, gets overactivated and 
depletes the cell of its nucleotide pool. If ADRP binds PAR, then it 
can bind proteins that are PARylated, including PARP. Indeed, bind- 
ing the latter may be most beneficial, since it can tether down this 
protein, slow down apoptosis, and prevent nucleotide depletion,
prolonging viral replication and transcription in the infected cell. 

The existence of the ADRP domain in all CoV nsp3s (as well as in 
several other viruses) argues for its critical role in the viral life cycle. 
Its function as an ADRP in recycling organic phosphate appears to 
be a dispensable function and does not correlate with the conser- 
vation of this domain. It appears that its role as an ADRP may be 
secondary to a more important role - such as its proposed role as a 
PAR-binding module. If so, the role of PAR-binding in the viral life 
cycle needs to be delineated experimentally. 


3.6. SARS-unique domain 


3.6.1. Structure 

The region corresponding to residues 366 to 722 has been con- 
sidered a domain unique to SARS-CoV and is called the SUD (SARS 
Unique Domain). The corresponding regions which are located just 
downstream of the ADRP domain in HCoV-NL63, PEDV, HCoV-
229E (among alphacoronaviruses) and all betacoronaviruses have
no assigned domain prediction, but secondary structure predic- 
tion suggests the presence of an additional macrodomain fold 
(Chatterjee et al., 2009). 

It has been demonstrated that the SUD may actually consist 
of three domains, termed by position: the N-terminal SUD-N, the 
middle SUD-M and the C-terminal SUD-C. Deuterium exchange
mass spectrometry data on a construct nsp3:451-651 initially 
appeared to support this notion, indicating that a well ordered 
domain exists from residues 523-651. Constructs representing 
nsp3:365-722 have been shown to be particularly susceptible to 
proteolysis (Stefanie Tech et al., 2004). Size exclusion chromatog-
raphy and PFO-PAGE indicate that it forms a dimer in solution and
1D-NMR spectra show that it is well folded.

The structure of SUD-N and SUD-M has been solved and sur- 
prisingly each domain was found to contain a macrodomain fold 
that was a close structural match for the SARS-CoV ADRP/Macro1 
domain despite a lack of detectable amino acid homology between 
these proteins (Tan et al., 2009). The presence of these additional 
macrodomain folds has also been confirmed by the NMR structure 
of the complete SUD (Johnson et al., 2010) and the NMR structure 
of SUD-M (Chatterjee et al., 2009). The SUD-C domain contained 
a novel fold that consisted of an antiparallel beta sheet (Johnson 
et al., 2010). 


3.6.2. Function 

All three of the domains that make up the SUD have been 
demonstrated to interact with nucleic acid in some way. The SUD- 
NM has a high affinity for G-rich sequences and G-quadruplexes 
(Tan et al., 2009), while the SUD-MC showed a general preference 
for purine nucleotides (Johnson et al., 2010). While the SUD-N and 
SUD-M domains bear a close structural resemblance to the SARS- 
CoV ADRP domain, neither domain has any demonstrable affinity 
for ADP-ribose (Tan et al., 2009). The amino acids responsible for 
SUD-M and SUD-C RNA binding have been mapped, and appear to 
fall near the region of SUD-M that corresponds to the active site 
in the structurally similar ADRP domain (Chatterjee et al., 2009). 
Together this suggests that the cluster of three macrodomains in 
SARS-CoV nsp3 arose through gene duplication and that the SUD 
may contribute to the function of nsp3 as an accessory to the viral 
replication process (Neuman et al., 2008). 


3.7. UB2 and PL2pro


3.7.1. Structure 

Unlike many coronaviruses that encode two papain-like proteases,
SARS-CoV has a single copy of a papain-like cysteine proteinase
(PL2pro) that cleaves polyprotein 1a at three sites at the N-terminus
(177LNGG↓AVT183, 815LKGG↓API821, and 2737LKGG↓KIV2743) to
release nsp1, nsp2, and nsp3, respectively (Harcourt et al., 2004;
Thiel et al., 2003). SARS-CoV PL2pro is also a deubiquitinating
enzyme; it efficiently disassembles diubiquitin and branched
polyubiquitin chains, cleaves ubiquitin-AMC substrates, and has
de-ISGylating activity (Chen et al., 2007b; Lindner et al., 2005).
Thus, PL2pro may have critical roles not only in proteolytic
processing of the replicase complex but also in subverting cellular
ubiquitination machinery to facilitate viral replication. The structure
of a PL2pro construct nsp3:723-1037 revealed a ubiquitin fold
(residues 723-783; UB2) and a well-ordered papain-like protease
catalytic domain (residues 784-1036; PL2pro) (Ratia et al., 2006).
The catalytic domain adopts the canonical “thumb, palm and fingers”
domain architecture. The thumb domain is formed by four
prominent helices; the palm is made up of a six-stranded β-sheet
that slopes into the active site, which is housed in a solvent-exposed
cleft between the thumb and palm domains; and a four-stranded,
twisted, antiparallel β-sheet makes up the “fingers” domain. Two
β-hairpins at the fingertips region contain four cysteine residues,
which coordinate a zinc ion with tetrahedral geometry. Mutational
analysis of the zinc-coordinating cysteines of SARS-CoV PLpro showed that
zinc-binding ability is essential for structural integrity and protease
activity (Barretto et al., 2005). PL2pro has several structural homologues
in the cysteine protease superfamily, the most significant
being USP14 and HAUSP, both of which are cellular DUBs. The
active site of PLpro consists of a catalytic triad of cysteine, histidine,
and aspartic acid residues, consistent with the catalytic triads found
in many PLpro domains. The recent structure of the TGEV PL1pro
demonstrates that the coronavirus PLpro folds have a common
architecture (Wojdyla et al., 2010), and likely arose through gene
duplication.


3.7.2. Function 

It has been demonstrated that an LXGG motif at the P4-P1 positions
of the substrate is essential for recognition and cleavage by
PL2pro (Barretto et al., 2005; Han et al., 2005). There appear to be
no preferences for the P′ positions or for residues N-terminal to P4.
It is not surprising then that PLpro is able to cleave after the four
C-terminal residues of ubiquitin, LRGG. As predicted by Sulea et al.
(2005), SARS-CoV PL2pro (nsp3 residues 1507-1858) does possess
deubiquitinating activity (Barretto et al., 2006; Lindner et al.,
2005) in addition to its better-known cysteine protease activity.
The specific deubiquitinating enzyme inhibitor, ubiquitin aldehyde,
inhibited its activity with a Ki of 210 nM.
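
The LXGG recognition rule described above can be illustrated with a short Python sketch. It is only a pattern scan under the simplifying assumption that any L-X-G-G tetrapeptide is a candidate site; the function name and the test peptides (the three SARS-CoV sites quoted in Section 3.7.1, written without the cleavage arrow, and the ubiquitin C-terminus) are chosen here for illustration and are not taken from any published tool.

```python
import re
from typing import List

# Zero-width lookahead so that overlapping LXGG motifs are all reported.
LXGG = re.compile(r"(?=(L.GG))")

def lxgg_cut_positions(sequence: str) -> List[int]:
    """Return 0-based indices of the residue that would occupy P1'
    (the position immediately after each LXGG motif)."""
    return [match.start() + 4 for match in LXGG.finditer(sequence)]

# Illustrative peptides: P4-P3' of the three PL2pro sites, plus the
# four C-terminal residues of ubiquitin.
examples = {
    "nsp1/2": "LNGGAVT",
    "nsp2/3": "LKGGAPI",
    "nsp3/4": "LKGGKIV",
    "ubiquitin C-terminus": "LRGG",
}

for name, peptide in examples.items():
    print(name, lxgg_cut_positions(peptide))
```

Each example reports a single cut position at index 4. For the ubiquitin tetrapeptide this index equals the peptide length, which corresponds to cleavage after the final glycine.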

Interestingly, a number of cellular deubiquitinases, including 
full-length USP14 and Ubp6, possess an N-terminal ubiquitin-like 
domain. Although the significance of this domain in these pro- 
teins is not well established, it has been demonstrated that the 
presence of the ubiquitin-like domain in USP14 and Ubp6 serves a 
regulatory function by mediating interactions between these deu- 
biquitinases and specific components of the proteasome (Hu et al., 
2005; Leggett et al., 2002). Comparisons of deubiquitinase activities
between wild-type and mutant Ubp6 lacking the Ubl domain reveal
that these associations are responsible for a 300-fold increase in
catalytic rate and serve to activate the enzyme (Leggett et al., 2002).
It is intriguing to consider that the Ubl domain of PLpro may
instead act as a sort of “decoy” or “lure” to distract cellular
ubiquitinating enzymes from other viral proteins, or it may mediate
protein-protein interactions between the replicase components.

While the role of PL2pro in polyprotein processing is well
understood, the physiological significance of its deubiquitinating
activity in the viral replication cycle is still not completely clear.
However, the conserved structural protein E is readily ubiquitinated
in infected cells, suggesting that deubiquitination may be
important in the assembly process (Alvarez et al., 2010). There
is now mounting evidence that PL2pro interferes with interferon
transcriptional activation pathways by inactivating TBK1, blocking
NF-kappaB signaling and preventing translocation of IRF3 to the
nucleus (Frieman et al., 2009; Wang et al., 2011; Zheng et al., 2008).


3.8. NAB and GSM 


3.8.1. Structure 

The region between the PL2pro domain and the transmembrane
region of nsp3 does not show sequence similarity to any known
domain. The disorder prediction programs DisEMBL 1.4, FoldIndex and
RONN predict a long disordered stretch of residues in the central
region of the segment in SARS-CoV, suggesting that it may
consist of two domains, with the second domain beginning
around residue 1226. An NMR structural study confirmed that
the NAB is an independently folded, functionally active unit. The
structured region comprises residues 1066 to 1181 and was studied using
the constructs nsp3(1066-1203) and nsp3(1035-1181). This globular domain
represents a new fold, with a parallel four-strand β-sheet holding
two α-helices of three and four turns that are oriented antiparallel
to the β-strands. Two antiparallel two-strand β-sheets and
two 3₁₀-helices are anchored against the surface of this barrel-like
molecular core. A positively charged patch on the molecule surface
was identified by NMR as containing the nucleic acid binding
activity.


3.8.2. Function 

The NAB has been demonstrated to form homodimers upon 
incubation at 37°C (Neuman et al., 2008), and displayed a high 
affinity for nucleic acid. While the NAB was able to interact with 
both single-stranded and double-stranded nucleic acids, cooling 
the protein-nucleic acid complex released single-stranded RNA, 
demonstrating that the NAB may function as a ssRNA binding pro- 
tein with RNA chaperone-like activity (Neuman et al., 2008). Little 
else is known about the function of NAB in the viral replication 
cycle, or about the structure and function of the GSM domain that 
follows or the conserved hydrophobic, non-transmembrane region 
that immediately precedes the first transmembrane region of nsp3. 


3.9. TM, ZF and Y 


3.9.1. Structure 

The region of nsp3 after the TM domain is highly conserved in
all CoVs, but this region has not yet been structurally characterized.
A Fold and Function Assignment System search (FFAS; Jaroszewski
et al., 2005) using the sequence from the SARS-CoV RBD to the
end of nsp3, including the TM domain and Y domain, reveals three of
seven significant hits (with expect values of -8 or better) to viral
RdRp proteins, which may hint at the evolutionary origin of nsp3,
which comprises nearly one fifth of most coronavirus genomes.
The level of conservation in the Y domain in particular approaches 
levels consistent with the other enzymatic domains of nsp3, and 
exceeds the conservation of other domains that are believed to be 
non-enzymatic (Neuman et al., 2008). 


3.9.2. Function 

It appears that domains from PL2P'° to the Y domain have not 
undergone significant deletion or rearrangement during coronavi- 
rus evolution, while other nsps like nsp1, nsp2, and the N-terminal 
regions of nsp3 clearly have evolved by duplication and deletion of 
domains (Neuman et al., 2008). Therefore nsp3 is more likely to confer
a basic and important function in a variety of hosts. UB1, SUD and
RBD bind RNA, and ADRP is part of the RNA-processing machinery. If
not for the proteinase(s), nsp3 would be classified exclusively as an
RNA binding/modifying protein. These regions have been shown to
change the localization of nsp4 (Hagemeijer et al., 2011), and cause
a membrane proliferation phenotype in transfected cells (Angelini
et al., 2013).

The topology of nsp3 leaves only one domain, ZF, on the lumenal 
side of the membrane. If nsp3 participates directly in the membrane
pairing exhibited in cells transfected with SARS-CoV nsp3 and nsp4 
(Angelini et al., 2013), then the ZF domain likely participates in this 
interaction. 


3.10. Nsp4 


3.10.1. Structure 

Nsp4 is a transmembrane protein with four transmembrane 
helices and an internal C-terminal domain (Oostra et al., 2007). 
Coronavirus nsp4 is approximately 500 amino acids in length,
and is the only part of the viral polyprotein that is released after
processing by both PLpro and Mpro. The location and topology
of the four transmembrane regions have been mapped (Oostra et al.,
2007).

The C-terminal domain of nsp4 is conserved in all known coronaviruses,
but deletion of this domain from an MHV infectious clone
resulted in only slightly attenuated virus growth, consistent with
a non-essential function (Sparks et al., 2007). Mutation of
the two glycosylation sites in nsp4, however, led to defective DMV
formation and attenuation (Gadlage et al., 2010). The structure of
the C-terminal domain of FCoV nsp4 has been reported (Fig. 4).
It consists of two small antiparallel β-sheets and four α-helices
(Manolaridis et al., 2009).


3.10.2. Function 

SARS-CoV nsp4 is an essential component for the formation of
viral double-membrane vesicles (Angelini et al., 2013). Intracellular
expression studies have demonstrated a biological interaction
between nsp4 and the carboxyl-terminal region of MHV nsp3 (Hagemeijer
et al., 2011), and co-expression of full-length SARS-CoV nsp3 and
nsp4 results in extensive membrane pairing, in which the paired
membranes are held at the same distance as observed in authentic
DMVs (Angelini et al., 2013). Nsp4 has also been shown to interact
with nsp2 in a yeast two-hybrid screen (von Brunn et al., 2007),
and to interact with other nsp4 molecules in cells (Hagemeijer et al.,
2011). Mutations in nsp4 that lead to a loss of nsp4 glycosylation
have been shown to cause aberrant DMV formation (Gadlage et al.,
2010; Sparks et al., 2007).


3.11. Nsp5 


3.11.1. Structure 

Mpro, a chymotrypsin-like protease, is encoded within the mature
polypeptide nsp5. It emerges by self trans-cleavage at the nsp4/5 and
nsp5/6 boundaries at residues 3238VLQ↓SGF3243 and 3544TFQ↓GKF3549
of polyprotein 1a/1ab. It belongs to the C30 family
of endopeptidases and is responsible for cleavage at 11 sequence-specific
sites within polyprotein 1a/1ab. The resultant “mature”
protein products (nsp4-16) assemble into components of the replication
complexes. Given its paramount importance in replicase
processing and therefore its role in viral replication, this protein
has been extensively studied from both structural and functional
perspectives (reviewed in Hilgenfeld et al., 2006; Ziebuhr et al.,
2000).

Based on both structure and sequence characteristics, nsp5 can
be divided into three domains. This domain prediction has been
confirmed by numerous crystal structures. Nsp5 is conserved in all
coronaviruses, indeed in all three nidoviral groups and several other
RNA viruses that share a common polyprotein processing scheme
(Ziebuhr et al., 2000). The sequence is related to the chymotrypsin-like
protease superfamily of endopeptidases.


Atlas of coronavirus replicase structure. Virus Res. (2013), 




8 B.W. Neuman et al. / Virus Research xxx (2013) xxx-xxx 


Box 3: Key nsp5 structures

Virus       Domain   Method          Accession   Reference
HCoV-229E   Mpro     X-ray (2.5 Å)   1P9S        Anand et al. (2003)
HCoV-HKU1   Mpro     X-ray (2.5 Å)   3D23        Zhao et al. (2008)
HCoV-HKU4   Mpro     X-ray (1.6 Å)   2YNB        Awaiting publication
IBV         Mpro     X-ray (2.0 Å)   2Q6F        Xue et al. (2008)
HCoV-NL63   Mpro     X-ray (1.6 Å)   3TLO        Awaiting publication
SARS-CoV    Mpro     X-ray (1.9 Å)   1UJ1        Yang et al. (2003)
TGEV        Mpro     X-ray (2.0 Å)   1LVO        Anand et al. (2002)


Structures of Mpro from five different coronaviruses have been
reported (Box 3; Fig. 4), and structures for HCoV-NL63 and HKU4
viruses have been released prior to publication. All these structures
show stringent conservation of a three-domain tertiary architecture
and a partially surface-exposed catalytic core.

The first two domains (residues 8-101 form domain 1 and
residues 102-184 form domain 2; Yang et al., 2003) are duplicated
closed β-barrels with n=6 and S=8 (Murzin et al., 1995) in
which the strands are arranged in a Greek key motif. These are all
hallmarks of the trypsin-like serine protease fold as defined in the SCOP
structure classification database, where Mpro has been placed in the
“viral cysteine protease of the trypsin fold” family. Close homologs of
this family include the picornavirus-like 3C cysteine proteases.

The critical role of the first seven residues at the N terminus
in dimerization, and their close proximity to the active site, makes
this enzyme an obligate dimer, although modification of the
termini appears to modulate higher-order oligomerization (Zhang
et al., 2010). Deletion of the first five amino acids results in complete
inactivation of this enzyme. The helical C-terminal domain
III mediates homodimerization of coronaviral Mpro. This
interaction is believed to be important for its trans-proteolytic
activity. The active site is located at the interface of the two β-barrels,
with the catalytic residues H41 and C145 (SARS-CoV numbering) being
contributed by domains 1 and 2, respectively.

The active site is located in a substantially solvent-exposed
cleft between the two β-barrels. Structures of Mpro
complexed with peptide/peptidomimetic substrates reveal that the
substrate peptides occupy the S1 and S2 pockets in an antiparallel
β-sheet orientation to the two interacting β-strands of the enzyme
active site, a feature seen in subtilisin and related serine proteases.
Sequence specificity is mainly determined by the S1 binding
pocket. All coronaviral Mpro enzymes recognize a glutamine as the P1
residue, a feature that is largely determined by structural
complementarity. The wall of the S1 pocket is lined by residues His 163 and
Phe 140 (which contribute side chains) and Met 165, Glu 166, and His
172 (which contribute main chain atoms) (Anand et al., 2003; Yang et al.,
2003). Comparisons of ligand-bound and apo structures have shown
that, unlike the S1 pocket (which houses the P1 residue and largely
remains unchanged), the S2 pocket undergoes significant conformational
changes upon ligand binding. The preference for leucine,
the most common residue at the P2 position (see Table 1), was
explained structurally by Yin and coworkers as a ligand-induced
ordering of the S2 pocket (Yin et al., 2007).


3.11.2. Function 

The most well-studied of all SARS proteins, nsp5, also known as
the main protease (Mpro) or in older literature as the chymotrypsin-like
or 3C-like protease (3CLpro), is the primary molecule responsible for
cleavage and maturation of the SARS-CoV polyproteins 1a and
1ab. As part of its proteolytic activity, it must
interact with all the non-structural proteins from nsp4 to nsp16,
presumably near its catalytic site.

It cleaves the polyproteins at 11 sequence-specific cleavage sites
(Table 1). The other three sites (nsp1/2, nsp2/3 and nsp3/4) are cleaved by
PLpro. Mpro is specific for a glutamine followed by a small hydrophobic
residue, usually an alanine or a glycine, sometimes a serine.
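
As a rough check of this specificity rule, the following Python sketch applies it to the eleven P3-P3′ hexapeptides listed in Table 1 (transcribed here with the cleavage arrow removed). The dictionary and function names are illustrative assumptions rather than part of any published analysis, and the rule is deliberately the simplified one stated above: Gln at P1 followed by Ala, Gly or Ser at P1′.

```python
# P3-P3' hexapeptides transcribed from Table 1 (cleavage occurs after
# the third residue of each string).
TABLE1_SITES = {
    "nsp4/5": "VLQSGF",
    "nsp5/6": "TFQGKF",
    "nsp6/7": "TVQSKM",
    "nsp7/8": "TLQAIA",
    "nsp8/9": "KLQNNE",
    "nsp9/10": "RLQAGN",
    "nsp11/12": "LMQSAD",
    "nsp12/13": "VLQAVG",
    "nsp13/14": "TLQAEN",
    "nsp14/15": "RLQSLE",
    "nsp15/16": "KLQASQ",
}

def matches_simple_rule(hexapeptide: str) -> bool:
    """True if P1 is Gln and P1' is Ala, Gly or Ser."""
    return hexapeptide[2] == "Q" and hexapeptide[3] in "AGS"

matched = [name for name, pep in TABLE1_SITES.items() if matches_simple_rule(pep)]
missed = [name for name in TABLE1_SITES if name not in matched]
p2_leucine = sum(pep[1] == "L" for pep in TABLE1_SITES.values())

print(len(matched), "of", len(TABLE1_SITES), "sites match the simple rule")
print("not matched:", missed)
print("sites with Leu at P2:", p2_leucine)
```

With the Table 1 sequences as transcribed, ten of the eleven sites satisfy the simplified rule; the nsp8/9 site, which has asparagine at P1′, is the exception. Leucine occupies P2 in eight of the eleven sites.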

Hsu et al. (2005) have described a four-step process
by which a mature, catalytically competent Mpro head-to-head
dimer is formed from multiple copies of polyprotein 1a.
Lin et al. (2004) have dissected its trans-cleavage activity (by ELISA-based
assays) and its cis-cleavage activity (by cell-based assays).

Unlike the canonical chymotrypsin-like proteases, Mpro houses
a conserved catalytic dyad, residues His 41 and Cys 145, and not the
more common catalytic triad of cysteine proteases. Instead, from
the structures, it is apparent that the role of the third catalytic
residue has been taken over by a conserved water molecule that
often lies within hydrogen bonding distance of H41. The role of a
conserved “catalytic” water molecule has long been recognized in
many proteolytic cleavage schemes, especially in serine proteases
(Perona et al., 1993). The mechanism of substrate hydrolysis is, however,
similar to that of its cysteine protease cousins. In the acylation
step (the first step), His 41 acts as a general
base and helps the sulfur atom of the catalytic cysteine side chain
to carry out the nucleophilic attack on the backbone C=O
group of the peptide bond to be cleaved (Yin et al., 2007). This first
step proceeds through a tetrahedral intermediate (TI-1). The next step
of the cleavage cycle is the collapse of this intermediate
and departure of the C-terminal half of the peptide product from
the active site. At this stage, the other portion of the peptide substrate
is covalently bound to the enzyme via a thioester linkage.
In the other half of the reaction cycle (the deacylation step), a
water molecule that is activated by His 41 acts as a nucleophilic
hydroxide ion (OH−), attacks the carbonyl carbon of the thioester and
releases the N-terminal half of the peptide product, thus regenerating
the cysteine. Many excellent and exhaustive studies have used
this mechanism of catalysis and the architecture of the catalytic site
to design several different classes of peptidomimetic
inhibitors targeted against Mpro of coronaviruses (including SARS-CoV)
and other pathogenic viruses. Several Mpro inhibitors have also
been structurally characterized (Akaji et al., 2011; Bacha et al., 
2008; Chu et al., 2006; Chuck et al., 2013; Grum-Tokars et al., 2007; 
Lee et al., 2007; Lee et al., 2009; Lee et al., 2005; Shan and Xu, 2005; 
Shao et al., 2007; Turlington et al., 2013; Verschueren et al., 2008; 
Wei et al., 2006; Yang et al., 2006; Yang et al., 2003; Yang et al., 
2007; Zhang et al., 2010; Zhu et al., 2011). 


3.12. Nsp6 


3.12.1. Structure and function 

The membrane topology of nsp6 has been determined (Oostra 
et al., 2008). Although SARS-CoV nsp6 is predicted by TMHMM 2.0
(Krogh et al., 2001) to contain seven transmembrane regions, only
six of these function as membrane-spanning helices. The presence
of additional non-transmembrane hydrophobic domains near
authentic transmembrane domains is a common theme running
through the DMV-making proteins nsp3, nsp4 and nsp6. IBV and
SARS-CoV nsp6 have been shown to activate autophagy, induc- 
ing vesicles containing Atg5 and LC3-II (Cottam et al., 2011). MHV 
Nsp6 is relocalized when it is co-expressed with nsp4 (Hagemeijer 
et al., 2012), suggesting that the two proteins interact. Nsp6 has 
also been shown to interact with nsp2, nsp8, nsp9 and sars9b via 
yeast two-hybrid assays (von Brunn et al., 2007). 


3.13. Nsp7 and Nsp8
See Box 4. 


3.13.1. Structure 
Nsp7 and nsp8 are two mature proteins that emerge through
cleavage of polyprotein 1a at the 3834TVQ↓SKM3839
(nsp6/7), 3917TLQ↓AIA3922 (nsp7/8) and 4115KLQ↓NNE4120 (nsp8/9)
boundaries. SARS-CoV nsp7 and nsp8 self-associate and form a large
sixteen-subunit supercomplex that has been directly implicated in
replication (Bartlam et al., 2005; Zhai et al., 2005).

Box 4: Key nsp7 and nsp8 structures

Virus      Domain      Method          Accession   Reference
SARS-CoV   nsp7        NMR             1YSY        Peti et al. (2005)
SARS-CoV   nsp7+nsp8   X-ray (2.6 Å)   2AHM        Zhai et al. (2005)
FCoV       nsp7+nsp8   X-ray (2.4 Å)   3UB0        Xiao et al. (2012)

The structure of nsp7 was first determined by NMR (Peti
et al., 2005), which revealed a four-helix bundle
in a novel sheet-like arrangement, with three of the
helices arranged antiparallel to each other while the fourth is
oriented at an angle to the bundle. Much of the structure-derived
functional information about nsp7 and nsp8 came from the study
by Rao and co-workers (Bartlam et al., 2005; Zhai et al., 2005),
who determined the 2.4 Å resolution crystal structure of the nsp7/8
supercomplex (Fig. 5). Eight subunits each of nsp7 and nsp8 form a
tight hexadecameric complex. In this complex, nsp7 adopts a tertiary
structure that is similar to its solution structure, with the minor
deviation that the fourth helix is oriented at a slightly different
angle and is more ordered. This is possibly due to its existence as
a complex with nsp8 in crystals. SARS-CoV nsp8 adopts two major
conformations described as the “golf club” and the “bent golf club”
folds, which have an extended long shaft domain with three helices
(one of which is very long) and a globular core at the C terminus
(Zhai et al., 2005).

The supercomplex, which is formed by a stoichiometric association
of eight subunits each of nsp7 and nsp8, is a hollow cylindrical
structure with a central channel and two handles (one on either
side of the structure). It has a very distinct bimodal distribution
of electrostatic charge on its surface, in which the outer skin of
the complex is composed of predominantly negatively charged
residues while the inner core channel is lined with positively
charged side chains. RNA binding studies using gel mobility shift
assays suggest that the function of the central positively charged
channel is to preferentially guide dsRNA through the supercomplex
either towards the polymerase (nsp12) or away from it during replication.
Mutagenesis experiments indicate that residues R26 and
K32 of nsp7 and K77, R80, K63, R84 and R85 are among those that
line the channel and are primarily responsible for this translocation
(Zhai et al., 2005).

The FCoV nsp7 and nsp8 proteins were recently shown to adopt 
similar structures to the SARS-CoV equivalents (Fig. 5), but in a distinctive
2:1 protein complex (Xiao et al., 2012). No known homologs
exist for either of these proteins outside the Coronaviridae lineage
within statistical limits of significance.

SCOP places nsp7 as a member of the “immunoglobulin/albumin-binding
domain-like” fold, with three of its helices arranged as a
bundle and having an overall topology that mirrors the spectrin-like
fold. The globular core domain of the nsp8 golf club has been defined
as a new fold.


Table 1
Cleavage sites of SARS-CoV Mpro using Tor2 as the reference SARS strain.


3.13.2. Function 

The most striking functional insight into the supercomplex
has been obtained by Canard and co-workers, who have shown
that coronaviral nsp8 encodes a second, non-canonical RNA polymerase
activity (Imbert et al., 2006). This template-dependent
oligonucleotide-synthesizing activity, which is dependent on Mn2+
or Mg2+ cations, was found to be preferentially enhanced by internal
5′-(G/U)CC-3′ trinucleotides that are present on RNA templates
and were used to initiate the synthesis of complementary oligonucleotides.
Typical extension products were found to be <6 residues
long. Nsp8 effectively polymerized on poly(rC) and oligo(rC15) templates
and, to a weaker extent, on poly(rU), but not on poly(rA). This
accessory polymerase, which is both catalytically weaker and of
lower fidelity than the main viral RdRp (nsp12), was potently
inhibited by 3′-dGTP and to a lesser extent by ddGTP and
2′-O-methyl-GTP, suggesting an avenue for possible therapeutic
inhibition. The primase activity of nsp8 was blocked by N-terminal
extension of nsp8 with peptides other than nsp7 (te Velthuis et al.,
2012).
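
The template preference described above can be sketched as a simple motif scan in Python. This is an illustration only: the function name and the example template are hypothetical, and treating "internal" as "not touching either end of the template" is an assumption made here, not a definition taken from Imbert et al. (2006).

```python
import re
from typing import List, Tuple

# Lookahead so that overlapping (G/U)CC trinucleotides are all found.
INITIATION_MOTIF = re.compile(r"(?=([GU]CC))")

def internal_initiation_motifs(template: str) -> List[Tuple[int, str]]:
    """Return (0-based start, trinucleotide) for every 5'-(G/U)CC-3'
    that does not include the first or last nucleotide of the template."""
    rna = template.upper().replace("T", "U")  # accept DNA-style input
    hits = []
    for match in INITIATION_MOTIF.finditer(rna):
        start = match.start()
        if start > 0 and start + 3 < len(rna):
            hits.append((start, match.group(1)))
    return hits

# Hypothetical template, written 5' to 3'.
print(internal_initiation_motifs("AAGCCAAUCCAA"))
```

For the hypothetical template shown, the scan reports GCC at index 2 and UCC at index 7. A motif at the very start or end of the template is ignored under the assumption stated above.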

Initial mutagenesis experiments on nsp8 implicated four
residues, K58, R75, K82 and S85, as essential for polymerization,
but a more recent study also identified a magnesium ion binding
site at D50 and D52 corresponding to a functional D/E-x-D/E
motif (te Velthuis et al., 2012). All of these residues localize on
the long α-helix (the stem of the golf club) and map onto one
of several dimer interface regions of the supercomplex. The main
function of nsp8 overall appears to be to catalyze the synthesis
of short stretches of RNA primers that can be utilized by
the primer-dependent main SARS-CoV polymerase nsp12. Using
dual-labeling immunofluorescence microscopy, Prentice and co-workers
have shown that nsp8 co-localizes with nsp2 and
nsp3 in cytoplasmic complexes that also contain the protein LC3,
a general marker for autophagic vacuoles (Prentice et al.,
2004).

A striking observation by Masters and co-workers (Zust et al.,
2007b) provided compelling evidence that nsp8 specifically interacts
with a molecular switch composed of a bulged stem-loop and
an RNA pseudoknot that exists in the 3′ untranslated region of
MHV and, indeed, most coronaviral genomes. These studies are aiding
the development of a model that explains the origins and initiation of
negative-strand genomic RNA synthesis (te Velthuis et al., 2012).

Yeast two-hybrid screening and co-immunoprecipitation
experiments by Lal and co-workers, subsequently confirmed by in vivo
co-localization studies, have shown that nsp8
interacts with the sars6 gene product as well, thereby implicating this
accessory protein in the replication complex (Kumar et al., 2007).
In a proteome-wide yeast two-hybrid screening study, nsp8 was
found to be one of the most promiscuous non-structural proteins,
interacting with no fewer than 13 of the 29 SARS-CoV proteins
tested (von Brunn et al., 2007).


Site        Sequence (...P3,P2,P1↓P-1,P-2,P-3...)
Nsp4/5      3238 VLQ↓SGF 3243
Nsp5/6      3544 TFQ↓GKF 3549
Nsp6/7      3834 TVQ↓SKM 3839
Nsp7/8      3917 TLQ↓AIA 3922
Nsp8/9      4115 KLQ↓NNE 4120
Nsp9/10     4228 RLQ↓AGN 4233
Nsp11/12    4367 LMQ↓SAD 4372
Nsp12/13    5299 VLQ↓AVG 5304
Nsp13/14    5900 TLQ↓AEN 5905
Nsp14/15    6427 RLQ↓SLE 6432
Nsp15/16    6773 KLQ↓ASQ 6778
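
The residue numbers in Table 1 can be used to work out the length of each mature product that lies between two consecutive Mpro sites. The Python sketch below is a worked arithmetic example under the assumption that each table entry gives the polyprotein position of P3, so that cleavage occurs after position P3 + 2; the names are illustrative.

```python
# Polyprotein residue number of P3 at each site, transcribed from Table 1.
P3_POSITION = {
    "nsp4/5": 3238, "nsp5/6": 3544, "nsp6/7": 3834, "nsp7/8": 3917,
    "nsp8/9": 4115, "nsp9/10": 4228, "nsp11/12": 4367, "nsp12/13": 5299,
    "nsp13/14": 5900, "nsp14/15": 6427, "nsp15/16": 6773,
}

def product_length(upstream_site: str, downstream_site: str) -> int:
    """Residues released between two cleavage sites.

    The product starts at P1' of the upstream site (P3 + 3) and ends
    at P1 of the downstream site (P3 + 2), inclusive.
    """
    first = P3_POSITION[upstream_site] + 3
    last = P3_POSITION[downstream_site] + 2
    return last - first + 1

lengths = {
    "nsp5": product_length("nsp4/5", "nsp5/6"),
    "nsp7": product_length("nsp6/7", "nsp7/8"),
    "nsp8": product_length("nsp7/8", "nsp8/9"),
    "nsp9": product_length("nsp8/9", "nsp9/10"),
    "nsp12": product_length("nsp11/12", "nsp12/13"),
}
print(lengths)
```

Under this reading of the table, nsp5 spans residues 3241-3546 (306 residues) and nsp12 spans 4370-5301 (932 residues), the latter agreeing with the nsp12 length quoted in Section 3.16.1.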
Fig. 5. Structures of the coronavirus replicase proteins nsp7, nsp8 and nsp9. Structures for the nsp7 and nsp8 heterodimers from SARS-CoV (2AHM; panel A) and FCoV (3UB0;
panel B) are shown to illustrate the distinctive structures of these proteins. The structure of homodimeric SARS-CoV nsp9 is taken from PDB entry 1QZ8 and is shown in panel C.


3.14. Nsp9 


See Box 5. 


3.14.1. Structure 

Two groups have independently determined the structure of 
nsp9 (Egloff et al., 2004; Sutton et al., 2004). It adopts a β-barrel
fold with a C-terminal α-helix (Fig. 5). The structure of HCoV-229E
nsp9 was subsequently solved, and found to contain a similar fold 
(Ponnusamy et al., 2008). Nsp9 of both viruses was found to be 
dimeric, although the dimer interface was stabilized by an addi- 
tional disulfide bond in HCoV-229E nsp9. 


3.14.2. Function 

Nsp9 binds ssRNA and dsDNA in a concentration dependent 
manner (Egloff et al., 2004; Sutton et al., 2004). Optimal bind- 
ing occurs with 45-mer oligonucleotides, consistent with binding 
occurring by the nsp9 dimer wrapping the DNA fragment around 
itself once (Egloff et al., 2004). Since RNA-binding is not sequence 
specific, nsp9 may protect nascent ssRNA from nucleases during 
viral RNA synthesis, given its natural abundance in the infected 
cell (Egloff et al., 2004). Nsp9 colocalizes in the perinuclear region 
along with other components of the replication complex (Bost et al., 
2000). While the precise role of nsp9 in viral replication is not yet
clear, Miknis and co-workers investigated the role of the dimer
interface and demonstrated that SARS-CoV nsp9 is essential for efficient
viral growth (Miknis et al., 2009).


3.15. Nsp10-11 


3.15.1. Structure 

The tenth coronavirus nonstructural protein constitutes the
carboxyl-terminal conserved domain of replicase polyprotein 1a,
and a region of this polyprotein homologous to
nsp10 can be readily identified in all coronaviruses (Joseph et al.,
2006). Originally described as a growth factor-like protein on the
basis of high cysteine content and sequence homology (Gorbalenya
et al., 1989), nsp10 is a highly conserved component of the coronavirus
replicase machinery. Reciprocal BLAST searches using various
nsp10 homologs identify a region of homology near the carboxyl
terminus of polyprotein 1a in more distantly related nidoviruses
such as torovirus and the newly identified white bream virus
(Schutze et al., 2006). However, no region of nsp10 homology has
been noted to date in the ronivirus or arterivirus lineages of the
Nidovirales. Bioinformatic analysis of nsp10 does not yield any
consistent matches to conserved enzymatic signatures. Thus, the
profile of nsp10 more likely fits a role as an auxiliary replicase
component, rather than an essential replicase enzyme.

Box 5: Key nsp9, nsp10 and nsp11 structures

Virus       Domain        Method          Accession   Reference
HCoV-229E   nsp9          NMR             2J97        Ponnusamy et al. (2008)
SARS-CoV    nsp9          X-ray (2.6 Å)   1QZ8        Egloff et al. (2004)
SARS-CoV    nsp10         X-ray (1.8 Å)   2FYG        Joseph et al. (2006)
SARS-CoV    nsp10+nsp11   X-ray (2.1 Å)   2G9T        Su et al. (2006)

X-ray crystallography structures (Bhardwaj et al., 2006; Joseph 
et al., 2006) have revealed that nsp10 is a single domain pro- 
tein consisting of a pair of antiparallel N-terminal helices stacked 
against an irregular β-sheet, a coil-rich C terminus, and two Zn fingers.
As such, nsp10 represents a novel fold, as might be expected
from the lack of protein or domain homology to other known pro- 
teins. Bacterially expressed nsp10 binds generic single-stranded 
and double-stranded nucleic acids with micromolar affinity (Joseph 
et al., 2006). 

Within the polyprotein, coronavirus nsp10 is followed by a short 
peptide of highly variable sequence that maps to the region of the 
genomic RNA where the ribosomal frameshift signal leading to the 
translation of the replicase enzyme cluster in open reading frame 
1b is located. In SARS-CoV, nsp11 is a 13-residue peptide which
can theoretically be processed from the C-terminus of polyprotein
1a; however, processing of nsp11 has not been demonstrated in
infected cells. The structure of the uncleaved nsp10-11 polypeptide 
showed some differences in oligomerization and crystal packing, 
but little difference in the core nsp10 structure (Bhardwaj et al., 
2006). In that study the nsp11 density was flexibly disordered 
(Bhardwaj et al., 2006). Thus, nsp11 more likely forms part of an 
essential translation reading frame shift mechanism, and is unlikely 
to significantly influence the function of nsp10. Synthesized nsp11 
peptide is fairly insoluble in aqueous buffers (J. Joseph, unpublished 
data). 


3.15.2. Function 

The first assignment of a function to nsp10 was noted from a 
study of MHV strains that contained temperature-sensitive lesions 
affecting viral RNA synthesis (Sawicki et al., 2005). It was further 
noted that this defect in nsp10 could not be compensated in cells by 
co-infection with viruses harboring temperature-sensitive lesions 
in nsp4 or nsp5, suggesting that coronavirus polyprotein 1a (at least 
from nsp4 onward) forms a single functional unit important for 
coronavirus discontinuous negative-strand RNA synthesis (Sawicki 
et al., 2005). Mutagenesis studies have confirmed the importance 
of nsp10 for general RNA synthesis and for controlling the ratio of 
subgenomic to genomic RNA (Donaldson et al., 2007b). Deletion 
of nsp10 or rearrangement of the genes encoding nsp7-10 completely
inhibited virus growth, while alteration of the Mpro cleavage
site between nsp9 and nsp10 reduced viral growth (Deming et al.,
2007). An unexpected finding was that the temperature-sensitive
lesion in nsp10 correlates with a severe inhibition of Mpro activity
at the non-permissive temperature (Donaldson et al., 2007a). From 
these results it appears clear that the function of nsp10 is closely 
tied to viral RNA synthesis. Nsp10 is now known to form part of the 
viral mRNA cap methylation complex (Bouvet et al., 2010) which is 
discussed below with the viral methyltransferase subunits.


3.16. Nsp12


3.16.1. Structure 

Nsp12 is a cleavage product of replicase polyprotein 1ab and is
produced by the action of Mpro. It is 932 residues long in SARS-CoV.
Immunoblotting and immunofluorescence analyses indicated that the
full-length, 106-kDa RdRp protein is present in infected Vero cells
and is part of the viral replication cycle (Prentice et al., 2004).

Cheng et al. (2005) expressed nsp12 in E. coli and
found that during purification, the protein was cleaved into three
stable fragments (1-110, 111-368, 369-932), which may correspond
to separate domains.

Nsp12 has high sequence identity with other coronavirus RdRps, 
but very low similarity to other viral polymerases. Based on manual
sequence alignments with the RdRps of poliovirus, rabbit
hemorrhagic disease virus, hepatitis C virus, reovirus and bacteriophage
Φ6, as well as HIV-1 reverse transcriptase, Xu
et al. (2003) were able to identify conserved sequence motifs and, 
by homology, to assign functions to these regions in the catalytic 
domain of the polymerase. 

Xu et al. (2003) built a three-dimensional model of the catalytic
domain of nsp12 (PDB ID 1O5S), based on alignments of conserved
motifs with other viral polymerase proteins. In this model, the
catalytic domain of SARS-CoV nsp12 adopts the canonical “palm
and fingers” architecture. The fingers subdomain is predicted to span
residues 376-584 and 626-679 and to consist of α-helices
in the base and β-strands and coils at the tip (Xu et al.,
2003). Similar to the HCV and RHDV RdRps, its fingers subdomain
also contains an N-terminal portion (residues 405-444) that
forms a long loop starting from the fingertip that bridges the
fingers and thumb subdomains. The palm subdomain of SARS-CoV
nsp12 (residues 585-625 and 680-807) forms the catalytic core
and contains the four highly conserved sequence motifs (A-D)
found in all polymerases and a fifth motif (E) unique to RdRps
and RTs (Poch et al., 1989). The core structure of the palm
subdomain is well conserved across all classes of polymerases. It consists
of a central three-stranded β-sheet flanked by two α-helices on
one side and a β-sheet and an α-helix on the other. Residues
forming the catalytic active site are found within motifs A and
C.
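
The subdomain boundaries quoted above can be kept in a small lookup table, which is convenient when mapping a mutation or motif onto the model. The following Python sketch is illustrative only: the names SUBDOMAINS and subdomain_of are hypothetical, and only the fingers and palm ranges stated in the text are included. No thumb boundaries are given here, so residues outside the listed ranges return None.

```python
# Residue ranges (1-based, inclusive) of the modelled nsp12 catalytic domain,
# copied from the text describing the Xu et al. (2003) model.
# The dictionary and function names are hypothetical, for illustration only.
SUBDOMAINS = {
    "fingers": [(376, 584), (626, 679)],
    "palm": [(585, 625), (680, 807)],
}


def subdomain_of(residue):
    """Return the subdomain name for a residue number, or None if unassigned."""
    for name, segments in SUBDOMAINS.items():
        if any(start <= residue <= end for start, end in segments):
            return name
    return None


if __name__ == "__main__":
    for res in (420, 600, 650, 700, 850):
        print(res, subdomain_of(res))
```

Residue 420, which lies in the long fingertip loop (residues 405-444), is reported as part of the fingers, in line with the description above.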

The structures of RdRps of a few non-CoVs have been determined,
for example those of hepatitis C virus, poliovirus, rabbit
hemorrhagic disease virus, reovirus and bacteriophage Φ6, as well as
HIV-1 reverse transcriptase
(see Xu et al., 2003). However, there is very low sequence similarity
between these structures and CoV RdRps. Xu et al. (2003) con- 
structed a comparative molecular model for SARS-CoV RdRp based 
on these structures, using manual sequence alignments anchored 
by conserved sequence motifs shared by all RdRps and reverse tran- 
scriptases. 


3.16.2. Function 

The RdRp is the central enzyme in the multi-component viral
replicase complex that replicates the viral RNA genome, a complex
that would contain several other viral proteins as well (Bost et al.,
2000; Brockway et al., 2003). The replicase transcribes (i) full-length
negative and positive strand RNAs; (ii) a 3’-co-terminal set of nested
subgenomic mRNAs that have a common 5’ ‘leader’ sequence derived
from the 5’ end of the genome; and (iii) subgenomic negative
strand RNAs with common 5’ ends and leader complementary
sequences at their 3’ ends (Lai, 2001; Thiel et al., 2003).
Full-length nsp12 has RdRp activity. The “catalytic” 64 kDa domain
and the N-terminal 12 kDa domain form a complex that possesses
comparable RdRp activity. However, the 64 kDa domain in
isolation has no activity. Cheng and coworkers suggest that the
N-terminal domain is required for polymerase activity, possibly via
involvement in template-primer binding (Cheng et al., 2005). Snijder
and co-workers were able to confirm that full-length nsp12
has robust, primer-dependent RNA polymerase activity (te Velthuis
et al., 2010), a finding generally confirmed by the later study of Ahn
et al. (2012).
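
As a rough consistency check, the fragment boundaries reported by Cheng et al. (2005) can be compared with the 12 kDa and 64 kDa domain sizes quoted above. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from the source, and the helper names are hypothetical. It is an order-of-magnitude estimate, not a substitute for a mass computed from the actual sequence.

```python
# Rule-of-thumb average mass of one amino acid residue (assumption, in daltons).
AVG_RESIDUE_MASS_DA = 110.0

# Fragment boundaries (1-based, inclusive) reported by Cheng et al. (2005).
FRAGMENTS = [(1, 110), (111, 368), (369, 932)]


def approx_mass_kda(start, end):
    """Approximate mass in kDa of a fragment spanning residues start..end."""
    n_residues = end - start + 1
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def fragment_report(fragments=FRAGMENTS):
    """Return (start, end, length, approximate kDa) for each fragment."""
    return [(s, e, e - s + 1, round(approx_mass_kda(s, e), 1)) for s, e in fragments]


if __name__ == "__main__":
    for start, end, length, kda in fragment_report():
        print(f"{start}-{end}: {length} residues, ~{kda} kDa")
    print(f"full length: ~{round(approx_mass_kda(1, 932), 1)} kDa")
```

Under this assumption the N-terminal fragment (1-110) comes out at about 12 kDa and the C-terminal fragment (369-932) at about 62 kDa, close to the 12 kDa and 64 kDa domains described above, while the full-length estimate of roughly 103 kDa is near the 106 kDa observed for nsp12.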

There has been some success in the use of inhibitors of viral 
polymerases as therapeutics. Hence, the RdRp is an attractive drug 
target. Lu et al. used a short RNAi targeting the RdRp and found that 
it significantly reduced plaque formation of SARS-CoV in Vero-E6 
cells (Brockway et al., 2004). However, such an approach would 
affect expression of the entire polyprotein 1a/ab and would not be 
specific to the RdRp. He et al. (2004) showed that aurintricarboxylic 
acid could potently reduce viral titer by more than 1000-fold when 
added to cells in culture. The same group subsequently suggested, 
by analogy to other RNA polymerases, that this compound may 
act on nsp12, and performed docking studies to predict the site of 
binding (Yap et al., 2005). 

The main RdRp would be predicted to interact either directly or 
indirectly with several other viral proteins, including the nsp3-6 
scaffold proteins, the nsp10-16 methylation complex and the 
nsp7-8 primase. Adedeji and coworkers showed that SARS-CoV
nsp12 enhances the helicase activity of nsp13 by two-fold (Adedeji 
et al., 2012). Nsp12 interacts with nsp8, nsp13, sars3a and sars9b 
according to yeast two-hybrid experiments and with nsp8 by 
co-immunoprecipitation experiments. Previous immunolocaliza- 
tion and interaction studies in MHV have also indicated that 
in vivo, nsp12 may act in concert with numerous other viral 
proteins - counterparts of SARS nsp1, 2, 5, 8, 9, 13 and sars9a 
(Bost et al., 2000; Brockway et al., 2003; von Brunn et al.,
2007).


3.17. Nsp13 


3.17.1. Structure and function 

Nsp13 is a helicase capable of unwinding both RNA and DNA 
duplexes in a 5’-to-3’ direction with high processivity (Ivanov et al.,
2004; Tanner et al., 2003). It possesses nucleoside and deoxynucleoside
triphosphatase (NTPase and dNTPase) activity against all standard nucleotides and
deoxynucleotides, and also RNA 5’-triphosphatase activity which 
may be involved in the first step of formation of the 5’ cap structure 
of the viral mRNAs (Ivanov et al., 2004; Tanner et al., 2003). The two 
hydrolase activities likely have a common active site, which con- 
tains a canonical Walker A NTPase-like motif (Ivanov et al., 2004). 
Since NTPase/helicase proteins are considered essential for viral 
viability (Kadare and Haenni, 1997), they are potential drug tar- 
gets (Anand et al., 2003; Holmes, 2003). Promising inhibitors are 
in trials for herpes simplex virus (Kleymann, 2003) and hepatitis 
C viral infections (Borowski et al., 2002). Several SARS-CoV helicase
inhibitors (bananin derivatives) have been identified (Tanner
et al., 2005).

While the structure of nsp13 has not yet been determined,
the protein has been modeled based on the E. coli Rep
ATP-dependent DNA helicase (PDB accession 1UAA). The model of the
helicase domain, spanning residues 80-568 of SARS-CoV nsp13, has
been deposited (PDB accession 2G1F; Bernini et al., 2006). The
N-terminus of nsp13 contains conserved cysteine and histidine
residues that are probably homologous with the metal binding
domains at the N-terminus of arterivirus helicases, which coordinate
up to four Zn2+ ions (van Dinten et al., 2000).

IBV nsp13 also has a proposed role in modulating the host
response, although it is not yet clear whether this role is conserved
in other coronaviruses (Xu et al., 2011). Overexpression of nsp13
led to cell cycle arrest by interfering with DNA polymerase delta,
though the report did not determine whether this effect occurs
normally during viral infection.


3.18. Nsp14


3.18.1. Structure 

SARS-CoV nsp14 is 527 residues long and is multifunctional. The 
structure of nsp14 has not yet been solved. All coronaviruses have 
homologues of nsp14, all containing an N-terminal domain with 3’ 
to 5’ exonuclease motifs I (DE), II (D) and III (D) within the first ~280 
residues of the protein (Moser et al., 1997; Thiel et al., 2003; Zuo 
and Deutscher, 2001) and a C-terminal cap N7-methyltransferase 
domain (Chen et al., 2013; Minskaia et al., 2006). Compared to other 
RNA ExoNs, CoV and torovirus ExoNs have an additional putative
Zn finger between the Exo I and II motifs (Thiel et al., 2003).

The coronaviral N7-methyltransferase is unusual in that it 
is physically and functionally linked with the exoribonuclease 
domain (Chen et al., 2013). Most of the residues known to be 
essential for methylation are located around the sequence motif 
DxGxPxA at positions 331-338 of nsp14, which is predicted to form
the S-adenosyl-L-methionine binding pocket of the methyltransferase
domain.
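
A sequence motif written in this style, where x stands for any residue, translates directly into a regular expression. The Python sketch below is a minimal illustration: the function name find_motif is hypothetical, and the example sequence is an invented toy string, not the nsp14 sequence.

```python
import re

# DxGxPxA, with "x" meaning any single residue, as a regular expression.
MOTIF = re.compile(r"D.G.P.A")


def find_motif(sequence):
    """Return (start, end) positions, 1-based and inclusive, of each motif match."""
    return [(m.start() + 1, m.end()) for m in MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    toy_sequence = "MKTDAGSPLAVV"  # invented example, not a real protein
    print(find_motif(toy_sequence))
```

Applied to a real nsp14 sequence retrieved from a sequence database, the same search would be expected to report a match near the positions given above, although that has not been checked here.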


3.18.2. Function 

Arteriviruses, which are nidoviruses related to coronaviruses but with
genomes about two-fold smaller, do not have an ExoN homologue or a
homologue of either of the viral methyltransferases. This seems to
indicate that ExoN in CoVs is required for stable synthesis of
exceptionally large RNA templates (Minskaia et al., 2006).

SARS-CoV nsp14 has 3’-to-5’ exonuclease activity on both ssRNA
and dsRNA (Minskaia et al., 2006). Recombinant nsp14 (as a maltose
binding protein fusion) hydrolyzed ssRNA to a ~12 nucleotide product.
When the DE-D-D residues were substituted with alanine, this
activity was abolished or greatly impaired; D90A/E92A and H268A
mutants had very low activity while N238A, D243A and D273A
had undetectable activity (Minskaia et al., 2006). Double-stranded RNA
also significantly enhanced exonuclease activity of the enzyme (Minskaia
et al., 2006). DNA and ribose-2’-O-methylated RNA are resistant
to cleavage (Minskaia et al., 2006). The activity of nsp14 is strictly
dependent on divalent cations (Chen et al., 2007a). Its activity was
highest in the presence of Mg2+ or Mn2+, lower in the presence
of low amounts of Zn2+ (0.5 mM) and undetectable with Ca2+ or
higher concentrations of Zn2+. With Mn2+, the size of the product
is slightly smaller than that obtained with Mg2+ or Zn2+, indicating
that the metal ions may modulate the configuration of the active
site differently (Chen et al., 2007a; Minskaia et al., 2006).

In MHV, nsp14 greatly enhances replication fidelity, essential 
for the replication and stability of the unusually large CoV genome 
(Eckerle et al., 2007). Recombinant viruses with mutations in the 
nsp14 active site were defective in growth and RNA synthesis 
and possessed 15-fold more mutations than wild-type viruses. 
Nsp14 therefore appears to play a role in error prevention or repair 
of nucleotide incorporation during RNA synthesis (Eckerle et al., 
2007). Recombinant HCoV-229E containing mutations in the active 
site of nsp14 had severe defects in RNA synthesis and no viable virus 
could be recovered. Besides strongly reduced genome replication, 
specific defects in sg RNA synthesis, such as aberrant sizes of spe- 
cific sg RNAs and changes in the molar ratios between individual 
sg RNA species, were observed (Minskaia et al., 2006). Sperry and
co-workers (Eckerle et al., 2006; Sperry et al., 2005) have shown that
a Tyr-to-His mutation (equivalent to SARS-nsp14 Tyr420His) in an
infectious clone of MHV-A59 attenuates virus replication and
virulence in mice, also arguing for the importance of this protein as a
proof-reading component of the viral replication machinery. Based
on temperature-sensitive mutants of MHV, Sawicki et al. (2005) 
showed that nsp14 is essential for the assembly of a functional 
replicase-transcriptase complex and appears to affect the positive- 
strand synthesis, as would be expected for a protein involved in 
both capping and mismatch repair.


Box 6: Key nsp15 and 16 structures


Virus      Domain        Method         Accession  Reference

MHV        nsp15         X-ray (2.7 Å)  2GTH       Xu et al. (2006)
SARS-CoV   nsp15         X-ray (2.6 Å)  2H85       Ricagno et al. (2006b)
SARS-CoV   nsp10+nsp16   X-ray (2.0 Å)  2XYQ       Decroly et al. (2011)


Nsp14 interacts with nsp10 and nsp16 to form the viral cap 
methylation complex, as described in more detail under nsp16 
below. Yeast two-hybrid and co-immunoprecipitation studies suggest that
nsp14 may also interact with nsp8 and sars9b (von Brunn et al., 
2007). 


3.19. Nsp15 


See Box 6. 


3.19.1. Structure 

Nsp15 of SARS-CoV is a 346-residue polypeptide that results
from the cleavage of polyprotein 1ab by Mpro at the sites RLQ↓SLE
(residues 6427-6432) and KLQ↓ASQ (residues 6773-6778). It is one of
the best-studied RNA processing enzymes of the coronaviral replicase,
with several recent studies focusing on its structural and functional
characterization due to its potential importance as a drug target.
Studies on HCoV-229E and equine arteritis virus have shown that
inactivating this enzyme by site-directed mutagenesis renders these
viruses non-viable. This enzyme is a specific marker for nidoviruses,
as no known homologs of nsp15 exist among other RNA viruses outside
of Nidovirales. Nsp15 preferentially cleaves 3’ of uridylates
in RNA at GUU or GU sequences to produce molecules with
2’-3’ cyclic phosphate ends (Bhardwaj et al., 2004). It acts on both
double-stranded RNA (dsRNA) and single-stranded RNA (ssRNA) and its
activity is dependent on the presence of Mn2+ ions (Bhardwaj et al.,
2004; Guarino et al., 2005). The ion binds only weakly but nonetheless
produces substantial conformational changes in the active site
loops (Bhardwaj et al., 2004; Bhardwaj et al., 2006).
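
The sequence preference described above can be illustrated by scanning an RNA string for uridylates in a given context and splitting the strand on their 3’ side. The Python sketch below is a deliberate simplification: the function names are hypothetical, the choice of which 5’ neighbour to require is left as a parameter, and RNA structure, metal ions and the cyclic phosphate chemistry are all ignored.

```python
def uridylate_sites(rna, preceding="G"):
    """Return 1-based positions of U residues whose 5' neighbour is in `preceding`."""
    rna = rna.upper()
    return [
        i + 1
        for i in range(1, len(rna))
        if rna[i] == "U" and rna[i - 1] in preceding
    ]


def cleave_3prime(rna, sites):
    """Split `rna` immediately 3' of each 1-based position listed in `sites`."""
    fragments, start = [], 0
    for pos in sorted(sites):
        fragments.append(rna[start:pos])
        start = pos
    if start < len(rna):
        fragments.append(rna[start:])
    return fragments


if __name__ == "__main__":
    toy_rna = "AGUUCGUA"  # invented substrate, for illustration only
    sites = uridylate_sites(toy_rna, preceding="G")
    print(sites, cleave_3prime(toy_rna, sites))
```

For the toy substrate, the two uridylates that follow a guanylate are selected and the strand is split into three pieces. Passing a different `preceding` set lets the same code express other reported context preferences.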

Several groups have characterized the structure of nsp15, by
cryoEM (Guarino et al., 2005) and by X-ray crystallography, for
SARS-CoV (Bhardwaj et al., 2007; Guarino et al., 2005; Joseph
et al., 2007; Ricagno et al., 2006a,b; Xu et al., 2006) and MHV (Xu
et al., 2006), as well as that of its eukaryotic homolog, XendoU from
Xenopus laevis (Renzi et al., 2006). The coronaviral structures have
revealed a three-domain architecture (Fig. 6). Again not surprisingly, the
catalytic C-terminal domain contains a novel fold. The first two
domains (residues 1-190) have a topological similarity to
methyltransferases, forming a ‘spitting image’ of the SAM-dependent
methyltransferase fold as defined in the SCOP database (Murzin A,
personal communication; Fig. 6). The full-length MHV and SARS
nsp15 enzymes were shown to be packed as hexamers, their
biologically relevant oligomeric state, forming a hollow, toroid-shaped
structure. Hexamerization is absolutely essential for both metal ion
binding and catalytic activity (Guarino et al., 2005). The eukaryotic
homolog XendoU from X. laevis is much shorter (missing the first two
domains) and shares only the conserved catalytic domain. In fact,
it was the first example of this endoribonuclease fold to be
structurally characterized. It is a functional monomer in solution. The
catalytic center of nsp15 retains features that resemble the active
site of an unrelated nuclease, RNase A (Cuchillo et al., 2011).


3.19.2. Function 

Fig. 6. Structures of the replicase proteins nsp10, 15 and 16. The hexameric structure of SARS-CoV nsp15 comes from PDB entry 2H85 and is shown in panel A. The structure
of the SARS-CoV nsp10-16 complex shown in panel B is taken from PDB entry 2XYQ (nsp10 is shown on the left).


While the enzymatic activity of nsp15 is now fairly well
understood, the role of nsp15 in the coronavirus replication
cycle is not. Nsp15 cleaves at uridylates preceded by cytidylate
or adenylate residues. When model RNA substrates were
2’-O-ribose methylated, endonucleolytic activity was blocked,
indicating a possible functional link between nsp15 and the 2’-O-ribose
methyltransferase nsp16. A structure-based catalytic mechanism for the
endonucleolytic activity of nsp15 was proposed by Ricagno et al.
(2006b), based on active site similarities with RNase A. In this
model, Lys-289, His-249, and His-234 act as the main catalytic
triad while Ser-293 and Tyr-342 play a supporting role
by stabilizing the aromatic ring of the nucleotide. Despite the structural
uniqueness of nsp15, the actual mode of scissile bond cleavage is
thought to be very similar to that of several RNases
that share the same catalytic triad, e.g. RNase T1, RNase A and others.
The six active sites of the hexamer are spatially segregated and
are thought to function independently of one another. The actual
electrostatic contribution of the Mn2+ ion to catalysis is unclear.
In XendoU, Mn2+ does not impede either RNA substrate binding or
cleavage (Renzi et al., 2006). However, fluorescence experiments
indicate that upon metal ion binding, the protein undergoes large
structural transitions, suggesting an indirect, possibly structural
role for the metal in stabilizing the enzyme in a catalytically
competent “on” state rather than an otherwise inactive “off” state (Renzi et al.,
2006). Drawing analogies with endonuclease EndN, a Zn-dependent
enzyme, Ricagno and co-workers have hypothesized that, given its
proximity to the catalytic site, Y342 might be the residue involved
in Mn2+ ion binding by forming a cation-π interaction, assisted
by the histidine H249 and the 2’-O-ribose moiety of the substrate.
Molecular modeling and docking studies led Renzi et al. (2006) to
propose a similar mechanism for endonucleolytic cleavage by the
eukaryotic homolog XendoU, wherein the O4 of the UMP nucleotide
forms a potential hydrogen bond with the catalytic histidine H178
and the pyrimidine ring of the nucleotide is involved in a stacking
interaction with the aromatic ring of tyrosine Y280.

The structure of a truncated form of nsp15 from SARS-CoV
that lacked the N-terminal hexamerization domain revealed
striking changes in the active site loops in the catalytic domain,
suggesting allosteric control of endonucleolytic activity and providing
a direct link between oligomerization and function (Joseph et al.,
2007). In this structure, which lacked the first 27 amino acids of
nsp15, the “active site loop” (residues 234-249, spanning the two
active site histidines H234 and H249) was flipped by as much
as ~120° into the active site cleft. In the full-length nsp15 hexamer,
the “active site loop” and the “supporting loop” are packed against
each other and are stabilized by intimate interactions with residues
contributed by the adjacent monomer.


3.20. Nsp16 


3.20.1. Structure 

Nsp16 lies at the C-terminal end of polyprotein 1ab and results
when Mpro cleaves the polyprotein at the nsp15/16
junction (Snijder et al., 2003). Although viral methyltransferases were
first identified in flavi- and reoviruses about two decades ago
(Koonin, 1993), their role in viral replication has only now begun to
be explored systematically (Gorbalenya et al., 2006). After rigorous
sequence and structure analysis using 3D-jury based metaserver
prediction methods, Rychlewski and co-workers noticed a strong
but remote homology between SARS nsp16 and an ancient family
of S-adenosyl-L-methionine (SAM) dependent 2’-O-ribose
methyltransferase enzymes (von Grotthuss et al., 2003; Ferron et al.,
2002). The sequence of the SARS MTase has features that place it
in the vicinity of the RrmJ/fibrillarin superfamily of 2’-O-ribose
methyltransferases (Feder et al., 2003).

The structure of SARS-CoV nsp16 has been determined as part 
of an nsp16-nsp10 complex by two groups independently (Chen 
et al., 2011; Decroly et al., 2011). Nsp16 adopts a canonical
S-adenosyl-L-methionine dependent methyltransferase fold, with a
central beta sheet framed by a helical clamp and a conserved 
catalytic KDKE tetrad (Martin and McMillan, 2002). The nsp16 
topology matches those of the dengue virus NS5 methyltransferase 
(Egloff et al., 2002) and vaccinia virus VP39 O-methyltransferase 
(Hodel et al., 1996). The structure of the nsp16/nsp10 interac- 
tion interface shows that nsp10 interacts with and probably helps 
to stabilize the S-adenosyl-L-methionine binding pocket. This has
the effect of making the putative RNA-binding groove of nsp16 
longer. The study by Decroly et al. (2011) also demonstrated 
that the methyltransferase inhibitor sinefungin interacts with the 
nsp16 active site, and could therefore form the basis of a new 
generation of inhibitors that attack the coronavirus methylation 
process. The structure of nsp10 was found to be virtually iden- 
tical when solved in the presence and absence of nsp16 (Chen 
et al., 2011; Decroly et al., 2011; Joseph et al., 2006; Su et al., 
2006). 


3.20.2. Function 

Nsp16 has been shown to interact with nsp10 and nsp14 to
form a viral cap methylation complex (Bouvet et al., 2010). All
eukaryotic mRNAs possess a modified guanosine at the 5’ terminus,
a feature that confers protection against degradation by host
nucleases. First reported in the early 1970s (Gingras, 2009), this “cap”
structure has been found to be present in almost all eukaryotic
viral RNAs. The generic nomenclature that has been widely adopted
is m7G(5’)ppp(5’)X(m)pY(m), where m7G corresponds to the modified
7-methylguanosine nucleotide. O-methyltransferases such as
nsp16 perform the final step of cap synthesis, which involves
adding a methyl group to the first nucleotide following the m7G,
and sometimes adding a methyl group at the same position on
subsequent nucleotides. While the m7G cap is essential for efficient
splicing, nuclear export, translation and stability of
eukaryotic mRNA, O-methylation is not (Cougot et al., 2004; Lewis
and Izaurralde, 1997; Schwer et al., 1998).

Zust et al. (2011) explored the role of nsp16 methylation in the
viral replication cycle and found that O-methylation acts as a
recognition marker that helps the host cell to recognize its own RNA
species, and respond to incompletely methylated cap structures.
Nsp16 makes replication possible by camouflaging newly synthesized
viral RNA to resemble host mRNA, which therefore blocks
the induction of an interferon response. This suggests that drugs
that act on nsp16 have the potential to interfere with viral replication
both by inhibiting the replication process and by promoting
intracellular recognition of, and response to, viral RNA
species.
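
The successive methylation states of the cap can be written out using the m7G(5’)ppp(5’)X(m)pY(m) style of notation. The Python sketch below is illustrative only: cap_notation is a hypothetical helper, and the labels cap-0, cap-1 and cap-2 follow common usage rather than anything stated in the text. In the coronavirus system described here, the N7-methyl group would be supplied by the nsp14 methyltransferase domain and the 2’-O-methyl group by nsp16.

```python
def cap_notation(n7_methylated, n_2o_methylated):
    """Build a cap string for a given methylation state.

    n7_methylated:   True if the cap guanosine carries the N7-methyl group.
    n_2o_methylated: number of leading transcript nucleotides (0, 1 or 2)
                     carrying a 2'-O-methyl group.
    """
    if n_2o_methylated not in (0, 1, 2):
        raise ValueError("n_2o_methylated must be 0, 1 or 2")
    guanosine = "m7G" if n7_methylated else "G"
    first = "Xm" if n_2o_methylated >= 1 else "X"
    second = "Ym" if n_2o_methylated == 2 else "Y"
    return f"{guanosine}(5')ppp(5'){first}p{second}"


if __name__ == "__main__":
    print("unmethylated:", cap_notation(False, 0))
    print("cap-0:", cap_notation(True, 0))
    print("cap-1:", cap_notation(True, 1))
    print("cap-2:", cap_notation(True, 2))
```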


4. Conclusion - a Galapagos of new folds 


New coronavirus protein structures continue to have important 
implications for virus biology and antiviral design. An unexpected 
benefit of the coronavirus structure boom is a wealth of previously 
undiscovered protein folds. The Protein Data Bank keeps records
of the discovery of new protein structures over time, and
currently classifies all known protein structures into fewer than 1500
folds.

Since 2003, the structures of 18 out of 28 coronavirus proteins,
encompassing a total of 27 domains, have been determined experimentally.
Several of these have been described as “new folds”, commonly defined as
folds with sufficiently different topology based on structure comparison
methods (DALI), fold classification schemes (CATH and SCOP) and
the family assignment scheme of Pfam. By these criteria 16 out of the
27 domains are indeed new folds, a striking rate of fold discovery
when compared to the ~10% for model pro- and eukaryotes being
reported by structural genomics centers.
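
The comparison can be made explicit with a line of arithmetic. The sketch below simply restates the figures given above (16 new folds among 27 domains, against a baseline of about 10%); the function name is hypothetical.

```python
def fold_discovery_rate(new_folds, total_domains):
    """Fraction of structurally characterized domains that are new folds."""
    return new_folds / total_domains


if __name__ == "__main__":
    rate = fold_discovery_rate(16, 27)   # figures quoted in the text
    baseline = 0.10                      # ~10% quoted for structural genomics centers
    print(f"coronavirus rate: {rate:.1%}")
    print(f"fold enrichment over baseline: {rate / baseline:.1f}x")
```

On these numbers the coronavirus proteome yields new folds at close to 60%, roughly six times the quoted baseline.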

Why do coronaviruses possess an abundance of new folds? 
One obvious reason might be that these structures have been 
relatively unexplored, and therefore under-represented in PDB. 
This is the first proteome-scale structural characterization of a 
coronavirus, and one with a disproportionately large number of 
singletons. Many of the new folds are contributed by the
16 nonstructural proteins of the replicase machinery, several of
which do not have counterparts outside Nidovirales. Ideally, new 
folds enable us to model sequence homologues, thereby filling 
out the immediate neighborhood in structure space. This is non- 
trivial for SARS-CoV proteins, since the new folds are (so far) 
either true sequence singletons (nsp1, nsp2, nsp3a, sars9b) or
found exclusively in Coronaviridae (nsp4, 7, 8, 9, 10, 15, spike RBD, 
sars9a). 

Fast mutation rates in viruses may encourage divergent sampling
of fold space (Andreeva and Murzin, 2006). This, along with
oligomerization, has been proposed to be a major facilitator of
fold evolution (Andreeva and Murzin, 2006), allowing a protein to
morph to a new fold in a process analogous to structural drift (Krishna
and Grishin, 2005). Elucidation of new folds, especially for isolated
groups of divergent homologues, should help in improving fold
recognition and comparative modeling algorithms.

These observations also have ramifications for the evolution of new
viral strains, a phenomenon which is the result of two antagonistic
forces: greater adaptability within an ecological niche (because
of intrinsically fast mutation rates) and increased evolutionary
constraints due to small genome size (Holmes and Rambaut,
2004). Overall, we are left with the piquant notion that proteins
in viral proteomes may occupy a unique niche in fold
space, and coronaviruses a peculiar island in this niche. Viruses
are the most diverse biological entities on this planet and second
only to prokaryotes in terms of sheer biomass. While the diversity
of protein structures they represent certainly defies imagination,
our understanding of protein folds and their migration in tertiary
fold space may well be locked up in them.


Appendix A. Supplementary data


Supplementary data associated with this article can be
found, in the online version, at
http://dx.doi.org/10.1016/j.virusres.2013.12.004.